Evaluation of term utility functions for very short multidocument summaries

Authors

  • Alexander K. Seewald
  • Christian Holzbaur
  • Gerhard Widmer
Abstract

We describe results from an application for relevance assessment in a setting related to multi-document summarization. For the task of characterizing given document collections by a short list of relevant terms, we have proposed the term utility function PxR. The measure is competitive with a variety of utility functions commonly used in text mining. Our function incorporates a user-definable parameter that allows for an explicit, continuous trade-off between precision and recall, which our users preferred over the more opaque term utility functions from text mining. The Fβ measure is similar but not identical to our measure and will also be discussed. Despite our users' preference for a user-definable parameter, the improvement gained by setting a different parameter value for each document collection is limited, and a static value for the parameter works almost as well. This seems to be true for the Fβ measure as well. A simple measure, SR, also performs competitively. In light of this evidence, a user-definable parameter seems to be unnecessary to achieve competitive performance.
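As an illustration of the precision/recall trade-off mentioned in the abstract, the following is a minimal sketch of ranking candidate terms by the standard Fβ measure. The abstract does not define PxR or SR, so neither is reproduced here. The per-term definitions of precision and recall, the function names `f_beta` and `rank_terms`, and the toy data are all assumptions made for this example, not the authors' implementation.

```python
from typing import List, Set, Tuple


def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Standard F-beta: weighted harmonic mean of precision and recall.

    beta > 1 weights recall more heavily, beta < 1 weights precision more.
    """
    b2 = beta * beta
    denom = b2 * precision + recall
    if denom == 0.0:
        return 0.0
    return (1.0 + b2) * precision * recall / denom


def rank_terms(
    collection: List[Set[str]],
    background: List[Set[str]],
    beta: float = 1.0,
    top_k: int = 5,
) -> List[Tuple[str, float]]:
    """Rank terms of `collection` by F-beta (hypothetical helper).

    Assumed definitions, for illustration only:
      precision = share of documents containing the term that lie in the collection
      recall    = share of collection documents that contain the term
    """
    vocab = set().union(*collection)
    scored = []
    for term in vocab:
        in_coll = sum(term in doc for doc in collection)
        in_bg = sum(term in doc for doc in background)
        precision = in_coll / (in_coll + in_bg)
        recall = in_coll / len(collection)
        scored.append((term, f_beta(precision, recall, beta)))
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]


# Toy data: each document is represented as a set of terms.
collection = [{"neural", "network"}, {"neural", "training"}]
background = [{"network", "protocol"}, {"training", "schedule"}]
ranking = rank_terms(collection, background, beta=1.0, top_k=3)
```

In this sketch `beta` plays the role of a user-definable parameter: passing one fixed value for every collection corresponds to the static setting that the abstract reports as working almost as well as per-collection tuning.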

Similar resources

Machine and Human Performance for Single and Multidocument Summarization

coherency—and be able to draw the “best” information from a set of documents. Automatic single-document text summarization1 has been an active research area since the 1950s, with a renaissance of approaches since the 1990s. Human single-document summarization is well defined when guidelines and recommendations drive performance.2,3 System-generated single-document summaries, while not always ma...

Experiments with CST-Based Multidocument Summarization

Recently, with the huge amount of growing information in the web and the little available time to read and process all this information, automatic summaries have become very important resources. In this work, we evaluate deep content selection methods for multidocument summarization based on the CST model (Cross-document Structure Theory). Our methods consider summarization preferences and focu...

A Cosine Maximization-Minimization approach for User-Oriented Multi-Document Update Summarization

This paper presents a User-Oriented Multi-Document Update Summarization system based on a maximization-minimization approach. Our system relies on two main concepts. The first one is the cross-summary sentence redundancy removal, which attempts to limit the redundancy of information between the update summary and the previous ones. The second concept is the newness of information detection in a cl...

Automatic multidocument summarization of research abstracts: Design and user evaluation

The purpose of this study was to develop a method for automatic construction of multi-document summaries of sets of research abstracts that may be retrieved by a digital library or search engine in response to a user query. Sociology dissertation abstracts were selected as the sample domain in this study. A variable-based framework was proposed for integrating and organizing research concepts a...

Sub-Event Based Multi-Document Summarization

The production of accurate and complete multiple-document summaries is challenged by the complexity of judging the usefulness of information to the user. Our aim is to determine whether identifying sub-events in a news topic could help us capture essential information to produce better summaries. We used six methods to create multi-document summaries and then compared them to find which method ...

Journal title:
  • Applied Artificial Intelligence

Volume 20

Publication date: 2006